perm filename CH2.5[206,JMC] blob sn#070504 filedate 1973-11-06 generic text, type T, neo UTF8
00100	\\M0NGR25;\M1BDJ25;\M2BDR25X;\M3SUB;\M4NGR30;\M5NGR40;\M6SUP;\F0
00200	
00300	\F45.Tree search.\F0
00400	
00500	\J	We begin with a general depth first tree search function.
00600	It can be used to search specific trees of possibilities by defining
00700	three auxiliary functions in a way that depends on the application.
00800	We have\.
00900	
01000		\F1search p ← \F2if \F1lose p \F2then \F0LOSE\;
01100	 \F2else if \F1ter p \F2then \F1p \F2else \F1searchlis[successors p]\F0
01200	
01300	where
01400	
01500		\F1searchlis u ← \F2 if n \F1u \F2then \F0LOSE \F2else \F1{search \F2a\;
01600	 u}[λx.\F2if \F1x = \F0LOSE \F2then \F1searchlis \F2d \F1u \F2else \F1x].\F0
01700	
01800	\JIn the applications, we  start with a position \F1p\F30\F0,  and we
01900	are looking for  a win in the successor tree of \F1p\F30\F0.  Certain
02000	positions lose and  there is  no point looking  at their  successors.
02100	This is decided  by the predicate \F1lose\F0. A position  is a win if
02200	it  doesn't  lose  and  it  satisfies  the  predicate \F1ter\F0.  The
02300	successors of a position  is given by the  function \F1successors\F0,
02400	and  the  value  of \F1search  p\F0  is  the  winning position.    No
02500	non-losing position should have the  name LOSE or the function  won't
02600	work properly.
02700	
02800		Our first example is  finding a path from an  initial node to
02900	a final node  in a graph represented by a list structure as described
03000	in chapter I.   A position is a path  starting from the initial  node
03100	and  continuing to  some intermediate  node and  is represented  by a
03200	list  of its nodes  in reverse order.   The three  functions for this
03300	application are\.
03400	
03500		\F1lose p ← \F2a \F1p \F0ε \F2d \F1p,
03600	
03700		ter p ← [\F2a \F1p = final],\F0
03800	
03900	and
04000	
04100		\F1successors p ← mapcar[\F2d \F1assoc[\F2a \F1p,graph],λx.x.p].\F0
04200	
04300	\J	Another example is the so-called \F1Instant Insanity\F0 puzzle.
04400	There are four cubical blocks, and each face of each block is colored with
04500	one of four colors.  The object of the puzzle is to build a tower of all four
04600	blocks such that each vertical face of the tower involves all four colors.
04700	In order to use the above defined function \F1search\F0 for this purpose,
04800	we must define the representation of positions and give the functions
04900	\F1lose, ter, \F0and \F1successors\F0.  A position is represented by a list
05000	of lists - one for each face of the tower.  Each sublist is the list of colors
05100	of the faces of the blocks showing in that face.  We shall assume that the
05200	blocks are described in the following longwinded but convenient way.  (We'll
05300	take up precomputing this description later.)  For each block there is a list
05400	of the 24 orientations of the block where each orientation is described as
05500	a list of the colors around the vertical faces of the block when it is in that
05600	orientation.  Thus the puzzle is described by a list of lists of lists which
05700	we shall call \F1puzz\F0.\.
05800	
05900		We now have
06000	
06100		\F1p\F30\F0 = (NIL NIL NIL NIL),
06200	
06300		\F1lose p ← orlis[p,λu.\F2a \F1u \F0ε \F2d \F1u],
06400	
06500		ter p ← [length \F2a \F1p = 4]\F0,
06600	
06700	and
06800	
06900		\F1successors p ← mapcar[\F2a \F1nth[puzz,1 + length \F2a \F1p]\;
07000	,λx.mapcar2[p,x,λyz.z.y]],\F0
07100	
07200	where
07300	
07400		\F1mapcar2[u,v,f] ← \F1if n \F1u \F2then \F0NIL \F2else \;
07500	f[\F2a \F1u,\F2a \F1v] . mapcar2[\F2d \F1u,\F2d \F1v,f].\F0
07600	
07700	\J	Getting the initial position in the desired form is as complicated
07800	a computation as the actual tree search.  It can be
07900	conveniently done by a sequence of assignment
08000	statements starting with a description of the blocks:\.
08100	
08200		\F1puzz1 ← \F0((G B B W R G) (G G B G W R) (G W W R B R) (G G R B W W)).
08300	
08400	\JHere each block is represented by a list of the colors of the faces starting
08500	with the top face, going around the sides in a clockwise direction and
08600	finishing with the bottom face.
08700	
08800		We need to go from this description of the blocks to a list of the
08900	possible cycles of colors on the vertical faces for the 24 orientations of
09000	the block.  This not easy, because the order in which we have given the colors
09100	is not invariant under rotations of the block.  An easy way out is to start with
09200	a block whose faces are assigned the numbers 1 thru 6 starting with the top,
09300	going clockwise around the sides and finishing with the bottom.  We write down
09400	one cycle of side colors for each choice of the face put on top and get the
09500	list of all 24 cycles by appending the results of generating the cyclic
09600	permutations of the cycles.  All this is accomplished by the assignment\.
09700	
09800	
09900		\F1puzz2 ← cycles[\F0(2 3 4 5)]*\F1cycles[(\F0(2 5 4 3)]*\;
10000	\F1cycles[(1 2 6 4)]
10100		          *cycles[(1 4 6 2)]*cycles[(1 3 6 5)]*cycles[(1 5 6 3)],\F0
10200	
10300	where the function \F1cycles\F0 is defined by
10400	
10500		\F1cycles u ← maplist[u,λx.x*upto[u,x]]\F0
10600	
10700	with the auxiliary function
10800	
10900		\F1upto[u,x] ← \F2if \F1x \F2eq \F1u \F2then \F0NIL \F2else a\;
11000	 \F1u . upto[\F2d \F1u,x].\F0
11100	
11200	\JNext we create for each block a list of substitutions expressing the
11300	colors of the six numbered faces.  We have\.
11400	
11500		\F1puzz3 ← mapcar[puzz1,λx.prup[\F0(1 2 3 4 5 6),\F1x]],\F0
11600	
11700	\Jand we use these substitutions to get for each block the list of 24 orientations
11800	of the block. Thus\.
11900	
12000		\F1puzz4 ← mapcar[puzz3,λs.sublis[s,puzz3]].
12100	
12200	\Jpuzz4\F0 has all 24 orientations of the first block while for symmetry reasons
12300	we need only consider three as distinct, say the first, ninth, and seventeen.  So
12400	we finally get\.
12500	
12600		\F1puzz ← (\F2a \F1nth[\F2a \F1puzz4,1] \F2a \F1nth[\F2a \F1puzz4,9] \;
12700	\F2a \F1nth[\F2a \F1puzz4,17]) . \F2d \F1puzz4.\F0
12800	
12900	\JThe program when compiled runs about 11 seconds on the PDP-10.
13000	
13100		A more sophisticated representation of face cycles and partial towers
13200	makes a factor of ten speedup without changing the basic search algorithm.  A
13300	face cycle is represented by 16 bits in a word four for each face a particular
13400	one of which being turned on tells us the color of the face.  If we \F2or\F0
13500	these words together for the blocks in a partial tower we get a word which
13600	tells us for each face of the tower what colors have been used.  A
13700	particular face cycle from the next block can be added to the tower if
13800	the \F2and\F0 of its word with the word representing the tower is zero.
13900	We represent a position by a list of words representing its partial
14000	towers with 0 as the last element and the highest partial tower as the
14100	first element.  The virtue of this representation is that it makes
14200	the description of the algorithm short.
14300	The initial position is (0).  The new \F1puzz\F0 can be formed from the
14400	old one by the assignment\.
14500	
14600		\F1puzza ← mapcar[puzz,λx.mapcar[x,zap]],\F0
14700	
14800	where
14900	
15000		\F1zap v ← \F2if n \F1v \F2then \F10 \F2else \F1poo \F2a \F1v +\;
15100	16zap \F2d \F1v,\F0
15200	
15300	and
15400	
15500		\F1poo x ← \F2if \F1x=\F0R \F2then \F11 \F2else if \F1x=\F0W \F2then\;
15600	 \F02 \F2else if \F1x=\F0G \F2then \F04 \F2else \F08.
15700	
15800	\JNow we need the new functions \F1lose, ter, \F0and
15900	\F1successors\F0.  These are\.
16000	
16100		\F1lose p ← \F2false\F1,
16200	
16300		\F1ter p ← [length p = 5],\F0
16400	
16500	and
16600	
16700		\F1successors p ← mapchoose[\F2a \F1nth[puzza,length p],\;
16800	λx.\F2a \F1p \F2and \F1x = 0,λx. [\F2a \F1p \F2or \F1x] . p].\F0
16900	
17000	where
17100	
17200		\F1mapchoose[u,pred,fn] ← \F2if n \F1u \F2then \F0NIL
17300			\F2else if \;
17400	\F1pred \F2a \F1u \F2then \F1fn[\F2a \F1u] . mapchoose[\F2d \F1u\;
17500	,pred,fn] \F2
17600			else \F1mapchoose[\F2d \F1u,pred,fn].\F0
17700	
17800	\J\F1lose\F0 is trivial, because the \F1mapchoose\F0 is used to make sure
17900	that only non-losing new positions are generated by \F1successors\F0.
18000	This version runs in a little less than one second on the PDP-10.  A greater
18100	speedup can be made by the application of some mathematics.  In fact, with
18200	enough mathematics, extensive tree search is unnecessary in this problem.
18300	
18400		\F1search\F0 is used when we want to search a tree of possibilities
18500	for a solution to a problem.  What if we want to find all solutions
18600	to a problem?  This can be done with a function \F1allsol\F0 that uses
18700	the same \F1lose, ter, \F0and \F1successors\F0 functions as does \F1search\F0.
18800	The simplest way to write \F1allsol\F0 is\.
18900	
19000		\F1allsol p ← \F2if \F1lose p \F2then \F0NIL \F2else if \F1ter p\;
19100	 \F2then \F1(p) \F2else \F1mapapp[successors p,allsol],\F0
19200	
19300	where
19400		\F1mapapp[u,fn] ← \F2if n \F1u \F2then \F0NIL \F2else \;
19500	\F1fn[\F2a \F1u] . mappap[\F2d \F1u,fn].\F0
19600	
19700	\JThis form of the function is somewhat inefficient because of all the
19800	\F1append\F0ing.  A more efficient form uses an auxiliary function as follows:\.
19900	
20000		\F1allsol p ← allsola[p,\F0NIL]
20100	
20200		\F1allsola[p,found] ← \F2if \F1lose p \;
20300	\F2then \F1found \F2else if \F1ter p \F2then \F1p . found\;
20400	 \F2else \F1allsolb[successors p,found],
20500	
20600		\F1allsolb[u,found] ← \F2if n \F1u \F2then \F1found \;
20700	\F2else \F1allsolb[\F2d \F1u,allsola[\F2a \F1u,found]].\F0
20800	
20900	\JThe recursive program structure that arises here is common when a list
21000	is to be formed by recurring through a list structure.\.
21100	
21200	
21300	
21400	\F46. Game trees.\F0
21500	
21600	\J	The positions that can be reached from an initial position in
21700	a  game  form  a  tree, and deciding what move to make often involves
21800	searching this tree.  However, game trees are searched in a different
21900	way  than the trees we have looked at, because the opposing interests
22000	of the players makes it not a search for a joint line  of  play  that
22100	will  lead  to  the  first  player winning, but rather a search for a
22200	strategy that will lead to a win regardless of what the other  player
22300	does.
22400	
22500		The   simplest  situation  is  characterized  by  a  function
22600	\F1successors\F0 that gives the positions that can be reached in  one
22700	move,  a  predicate  \F1ter\F0  that  tells  when a position is to be
22800	regarded  as  terminal  for  the  given  analysis,  and  a   function
22900	\F1imval\F0  that  gives  a  number  approximating  the  value  of the
23000	position to one of the  players.   We  shall  call  this  player  the
23100	maximizing  player  and  his opponent the minimizing player. Usually,
23200	the numerical values arise, because the search cannot be carried  out
23300	to the end of the game, and the analysis stops with reasonably static
23400	positions that  can  be  evaluated  by  some  rule.   Naturally,  the
23500	function  \F1imval\F0  is  chosen  to  be  easy  to  calculate and to
23600	correlate well with the probability that the  maximizing  player  can
23700	win the position.
23800	
23900		The simplest rule for finding the correct move in a position
24000	uses auxiliary functions
24100	\F1valmax\F0 and \F1valmin\F0 that give a value to a position
24200	by using \F1imval\F0 if the position is terminal and taking the max
24300	or min of the successor positions otherwise.
24400	
24500		For this we want functions for getting the maximum or the
24600	minimum of a function on a list, and they are defined as follows:\.
24700	
24800		\F1maxlis[u,f] ← \F2if n \F1u \F2then \F1-∞ \F2else \F1\;
24900	max[f[\F2a \F1u],maxlis[\F2d \F1u,f]],\F0
25000	
25100	and
25200	
25300		\F1minlis[u,f] ← \F2if n \F1u \F2then \F1∞ \F2else \F1\;
25400	min[f[\F2a \F1u],minlis[\F2d \F1u,f]].\F0
25500	
25600	\JIn these functions, -∞ and ∞ represent numbers that are smaller and
25700	larger than any actual values that will occur in evaluating \F1f\F0.
25800	If these numbers are not available, then it is necessary to prime
25900	the function with the values of \F1f\F0 applied to the first element
26000	of the list, and the functions are meaningless for null lists.
26100	Iterative versions of the functions are generally better; we give only
26200	the iterative version of \F1maxlis\F0, namely\.
26300	
26400		\F1maxlis[u,f] ← maxlisa[u,-∞,f],\F0
26500	
26600	where
26700	
26800		\F1maxlisa[u,x,f] ← \F1if n \F1u \F2then \F1x \F2else \;
26900	\F1maxlisa[\F2d \F1u,max[x,f[\F2a \F1u]],f].\F0
27000	
27100	We now have
27200	
27300		\F1valmax p ← \F2if \F1ter p \F2then \F1imval p \F2else\;
27400	\F1maxlis[successors p,valmin]\F0,
27500	
27600	and
27700	
27800		\F1valmin p ← \F2if \F1ter p \F2then \F1imval p \F2else\;
27900	\F1minlis[successors p,valmax]\F0.
28000	
28100	The best move is now chosen by
28200	
28300		\F1movemax p ← bestmax[succesors p,valmin],\F0
28400	
28500	or
28600	
28700		\F1movemin p ← bestmin[succesors p,valmax],\F0
28800	
28900	where
29000	
29100		\F1bestmax[u,f] ← bestmaxa[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],
29200	
29300		\F1bestmaxa[u,sofar,valsofar,f] ← \F2if n \F1u \F2then \F1sofar
29400			\F2else if \F1f[\F2a \F1u] ≤ bestsofar \F2then\;
29500	\F1bestmaxa[\F2d \F1u,sofar,bestsofar,f]
29600			\F2else \F1bestmaxa[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],
29700	
29800		\F1bestmin[u,f] ← bestmina[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f],\F0
29900	
30000	and
30100	
30200		\F1bestmina[u,sofar,valsofar,f] ← \F2if n \F1u \F2then \F1sofar
30300			\F2else if \F1f[\F2a \F1u] ≥ bestsofar \F2then\;
30400	\F1bestmina[\F2d \F1u,sofar,bestsofar,f]
30500			\F2else \F1bestmina[\F2d \F1u,\F2a \F1u,f[\F2a \F1u],f].\F0
30600	
30700	\J	However, this straight minimax computation of the best move does
30800	much more computation in general than is usually necessary.  The simplest
30900	way to see this is from the move tree of Figure 2.6.\.
31000	
31100	
31200	
31300	
31400	
31500	
31600	
31700	
31800	
31900	
32000	
32100	\CFigure 2.6
32200	
32300	\JWe see from this figure that it is unnecessary to examine the moves marked
32400	x, because their values have no effect on the value of the game or on the
32500	correct move since the presence of the 9 is sufficient information at this
32600	point to show that the move leading to the vertex preceding it should not
32700	be chosen by the minimizing player.
32800	
32900		The general situation is that it is unnecessary to examine further
33000	moves in a position once a move is found that refutes moving to the
33100	position in the first place.  This idea is called the αβ-heuristic and
33200	expressed in the following recursive definitions:\.
33300	
33400		\F1maxval p ← maxval1[p,-∞,∞],
33500	
33600		maxval1[p,α,β] ← \F2if \F1ter p \F2then \F1max[α,min[β,imval p]]\;
33700	 \F2else \F1maxval2[successors p,α,β],
33800	
33900		maxval2[u,α,β] ← {minval1[\F2a \F1u,α,β]}[λw. \F2if \F1w = β \F2then\;
34000	 \F1β \F2else \F1maxval2[\F2d \F1u,w,β]],
34100	
34200		\F1minval p ← minval1[p,-∞,∞],
34300	
34400		minval1[p,α,β] ← \F2if \F1ter p \F2then \F1max[α,min[β,imval p]]\;
34500	 \F2else \F1minval2[successors p,α,β],
34600	
34700		minval2[u,α,β] ← {maxval1[\F2a \F1u,α,β]}[λw. \F2if \F1w = α \F2then\;
34800	 \F1α \F2else \F1minval2[\F2d \F1u,α,w].\F0
34900	
35000	\J	The reduction in number of positions examined given by the αβ algorithm
35100	over the simple minimax algorithm depends on the order in which the moves are
35200	examined.  In the worst case, the moves happen to be examined in inverse order
35300	of merit in every position on the tree, i.e. the worst move first.  In that case,
35400	there is no improvement over minimax.  The best case is the one in which the
35500	best move in every position is examined first.  If we look \F1n\F0 moves
35600	deep on a tree that has \F1k\F0 successors to each position, then minimax looks
35700	at \F1k\F6n\F0 positions while αβ looks at about only \F1k\F6n/2\F0.  Thus a
35800	program that looks at 10\F64\F0 moves with αβ might have to look at 10\F68\F0
35900	with minimax.  For this reason, game playing programs using αβ make a big
36000	effort to include as good as possible an ordering of moves into the \F1successors\F0
36100	function.  When there is a deep tree search to be done, the way to make the
36200	ordering is with a short look-ahead to a much smaller depth than the main
36300	search.  Still shorter look-aheads are used deeper in the tree, and beyond
36400	a certain depth, non-look-ahead ordering methods are of decreasing complexity.
36500	
36600		A version of αβ incorporating optimistic and pessimistic evaluations
36700	of positions was first proposed by McCarthy about 1958.  Edwards and Hart at
36800	M.I.T. about 1959 proved that the present version of αβ and calculated the
36900	improvement it gives over minimax.  The first publication, however, was
37000	Brudno (1963).  It is worth noting that αβ was not used in the early chess
37100	playing programs in spite of the fact that it is clearly used in any human
37200	play.  Its non-use, therefore, represents a failure of self-observation.
37300	Very likely, there are a number of other algorithms used in human thought
37400	that we have not noticed an incorporated in our programs.  To the extent
37500	that this is so, artificial intelligence will be \F1a posteriori\F0 obvious
37600	even though it is \F1a priori\F0 very difficult.\.